How Economic, Demographic, and Health Factors Influence Hygiene Coverage in Healthcare Facilities
Name: Divij Singh Bankal ID: A15463
Introduction
Hygiene coverage in healthcare facilities is a critical component of public health infrastructure, directly influencing patient outcomes, infection control, and broader community health. Despite its importance, significant disparities persist across countries, shaped by a complex interplay of economic, demographic, and healthcare-related factors.
In this report, we systematically explore how indicators such as GDP per capita, population size, and life expectancy correlate with hygiene coverage across healthcare facilities globally. We aim to uncover patterns and potential leverage points for targeted public health interventions and investments.
Data Sources
Our analysis is based on two primary datasets obtained from UNICEF: - unicef_indicator_1.csv: This dataset provides country-level data on hygiene coverage, representing the proportion of healthcare facilities with access to basic hygiene services. - unicef_metadata.csv: This dataset contains country metadata, including important socioeconomic indicators such as GDP per capita (constant 2015 US$), total population, and life expectancy at birth.
These datasets were merged on the country level, ensuring a comprehensive and cohesive framework for our analysis.
Hygiene Observations vs Life Expectancy (Bubble Plot)
One of the key visualizations we created is a bubble scatter plot, designed to intuitively reveal relationships between hygiene coverage, economic factors, and health outcomes: - The x-axis represents the proportion of healthcare facilities with basic hygiene services (“Hygiene Observation Value”). - The y-axis represents the life expectancy at birth in years. - The size of each bubble corresponds to the total population of each country, giving a sense of scale. - The color of each bubble reflects the country’s GDP per capita, with higher GDP shown in lighter shades.
This visualization highlights important global trends: - In general, countries with higher hygiene coverage tend to have higher life expectancy. - Larger populations often face more varied outcomes, indicating the influence of other systemic factors. - Economic prosperity (GDP per capita) often correlates with better hygiene and higher life expectancy, but with notable exceptions.
This initial plot sets the foundation for deeper exploration into how hygiene, health, and economic indicators intersect across different nations.
Code
import pandas as pdimport plotly.express as px# Load datasetsindicator_df = pd.read_csv('unicef_indicator_1.csv')metadata_df = pd.read_csv('unicef_metadata.csv')# Make column names lowercase for consistencyindicator_df.columns = indicator_df.columns.str.lower()metadata_df.columns = metadata_df.columns.str.lower()# Keep only needed columns from indicatorhygiene_df = indicator_df[['country', 'alpha_3_code', 'time_period', 'obs_value']]# Drop missing valueshygiene_df = hygiene_df.dropna()# Merge hygiene data with metadata on 'country'merged_df = pd.merge(hygiene_df, metadata_df, on='country', how='left')# Ensure 'time_period' is integermerged_df['time_period'] = merged_df['time_period'].astype(int)# Keep only the latest year per countrylatest_year_df = merged_df.loc[merged_df.groupby('country')['time_period'].idxmax()]# Drop rows where important columns are missinglatest_year_df = latest_year_df.dropna(subset=['population, total','gdp per capita (constant 2015 us$)','life expectancy at birth, total (years)','obs_value'])# Plot scatter: Hygiene observations vs Life Expectancyfig = px.scatter( latest_year_df, x='obs_value', y='life expectancy at birth, total (years)', hover_name='country', color='gdp per capita (constant 2015 us$)', size='population, total', title='Hygiene Observations vs Life Expectancy', labels={'obs_value': 'Hygiene Observation Value','life expectancy at birth, total (years)': 'Life Expectancy (Years)','gdp per capita (constant 2015 us$)': 'GDP per Capita (2015 US$)','population, total': 'Population' }, size_max=60, template='plotly_white')fig.show()
# Merge hygiene data with metadatamerged_df = pd.merge(hygiene_df, metadata_df, on='country', how='left')# Check merged datamerged_df.head()
country
iso_alpha
year_x
hygiene_coverage
alpha_2_code
alpha_3_code
numeric_code
year_y
population, total
gdp per capita (constant 2015 us$)
gni (current us$)
inflation, consumer prices (annual %)
life expectancy at birth, total (years)
military expenditure (% of gdp)
fossil fuel energy consumption (% of total)
gdp growth (annual %)
birth rate, crude (per 1,000 people)
hospital beds (per 1,000 people)
0
Andorra
AND
2000
100.0
AD
AND
20
1960
9510.0
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
1
Andorra
AND
2000
100.0
AD
AND
20
1961
10283.0
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
2
Andorra
AND
2000
100.0
AD
AND
20
1962
11086.0
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
3
Andorra
AND
2000
100.0
AD
AND
20
1963
11915.0
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
4
Andorra
AND
2000
100.0
AD
AND
20
1964
12764.0
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
Global Hygiene Coverage Map
To visualize the geographical distribution of hygiene standards, we generated a world map highlighting the hygiene coverage in healthcare facilities for each country.
This map uses color intensity to represent the proportion of healthcare facilities meeting basic hygiene requirements:
- Darker shades indicate higher hygiene coverage (closer to 100%).
- Lighter shades indicate lower hygiene coverage, signaling areas where access to basic hygiene services may need attention.
By mapping the data globally, we can easily spot regional patterns, identify countries excelling in healthcare hygiene, and recognize those that might require more policy focus or investment. This spatial representation provides an intuitive and impactful overview, complementing the numerical analysis.
The map serves as a crucial starting point for deeper exploration into the factors influencing hygiene coverage across different regions.
Code
# Create a choropleth mapfig = px.choropleth( hygiene_df, locations="iso_alpha", color="hygiene_coverage", hover_name="country", color_continuous_scale="Greens", labels={'hygiene_coverage': 'Hygiene Coverage (%)'}, title="Proportion of Health Care Facilities with Basic Hygiene Services")fig.update_layout(geo=dict(showframe=False, showcoastlines=False))fig.show()
Relationship Between GDP per Capita and Hygiene Coverage
To explore the association between economic prosperity and healthcare facility hygiene, we created a bubble plot showing GDP per capita (constant 2015 US$) on the x-axis and hygiene coverage on the y-axis.
Each bubble represents a country, and the size of the bubble reflects the total national population.
This visualization helps us understand whether wealthier countries tend to have better hygiene coverage in healthcare facilities, while also considering population differences.
It also highlights countries that may have high hygiene standards despite lower economic resources, and vice versa.
Code
# Keep the latest record per country based on year_xlatest_data = merged_df.sort_values('year_x').groupby('country').tail(1)# Scatter plot with one point per countryfig = px.scatter( latest_data, x="gdp per capita (constant 2015 us$)", y="hygiene_coverage", hover_name="country", trendline="ols", labels={"gdp per capita (constant 2015 us$)": "GDP per Capita","hygiene_coverage": "Hygiene Coverage (%)" }, title="GDP per Capita vs Hygiene Coverage (One point per Country)")fig.update_layout(template="plotly_white")fig.show()
Relationship Between Population and Hygiene Coverage
To further investigate factors influencing hygiene standards, we plotted Population (total number of people) against Hygiene Coverage.
This plot helps reveal whether countries with larger populations face greater challenges in maintaining high hygiene standards in healthcare facilities.
By analyzing this relationship, we can identify if smaller or larger countries tend to achieve better hygiene coverage, and spot any notable outliers that may warrant further exploration.
Code
import plotly.graph_objects as go# Use latest available year (already extracted)latest_year_df = merged_df.loc[merged_df.groupby('country')['year_x'].idxmax()]# Sort by population and pick top 30top30_df = latest_year_df.sort_values('population, total', ascending=False).head(30)fig = go.Figure()# Bar for Populationfig.add_trace(go.Bar( x=top30_df['country'], y=top30_df['population, total'], name='Population', yaxis='y1'))# Line for Hygiene Coveragefig.add_trace(go.Scatter( x=top30_df['country'], y=top30_df['hygiene_coverage'], name='Hygiene Coverage (%)', yaxis='y2', mode='lines+markers'))# Layout settingsfig.update_layout( title="Population and Hygiene Coverage (Top 30 Countries)", xaxis=dict(title="Country"), yaxis=dict(title="Population", side="left"), yaxis2=dict(title="Hygiene Coverage (%)", overlaying="y", side="right"), legend=dict(x=0.5, y=1.1, orientation="h"), template="plotly_white", height=600)fig.show()
Life Expectancy and Hygiene Coverage
Finally, we explore the relationship between Life Expectancy and Hygiene Coverage. Life expectancy is a strong overall indicator of health outcomes in a country. By plotting it against hygiene coverage, we can assess whether countries with better healthcare hygiene standards also experience longer average lifespans. This time series chart tracks how hygiene coverage has evolved over time for various countries. The chart highlights trends, where bubble size represents population, showing how improvements in healthcare infrastructure, particularly hygiene, may correlate with broader health benefits at the population level. Notably, Egypt stands out as a country with significant progress in hygiene coverage over the years, reflecting improvements in healthcare standards. On the other hand, Tajikistan did not show any improvement, maintaining a constant level of hygiene coverage. Additionally, Nigeria experienced a degradation in its hygiene coverage percentage, highlighting challenges in maintaining or improving healthcare hygiene despite other factors. This analysis underscores the dynamic nature of healthcare systems and the importance of continuous improvement in hygiene standards to ensure better public health outcomes.
Code
# Filter countries that have hygiene coverage data across multiple yearsvalid_countries = ( merged_df.dropna(subset=["hygiene_coverage"]) .groupby('country') .filter(lambda x: x['year_x'].nunique() >3) # countries with more than 3 years of data ["country"] .unique())# Filter dataset for those valid countriestime_series_df = merged_df[merged_df["country"].isin(valid_countries)]# Sort for clean linestime_series_df = time_series_df.sort_values(["country", "year_x"])# Line plotfig = px.line( time_series_df, x="year_x", y="hygiene_coverage", color="country", markers=True, labels={"year_x": "Year","hygiene_coverage": "Hygiene Coverage (%)","country": "Country" }, title="Hygiene Coverage Over Time (Countries with Sufficient Data)")fig.update_layout(template="plotly_white")fig.show()
Conclusion
In conclusion, the analysis revealed important insights into the relationship between hygiene coverage, economic prosperity, and health outcomes. We observed that countries with higher GDP tend to have better hygiene standards in healthcare facilities, which in turn correlates with higher life expectancy. However, there were notable exceptions, such as Nigeria, where hygiene coverage decreased over time. Additionally, Egypt showed significant improvements in hygiene coverage, highlighting the importance of continuous efforts in healthcare infrastructure. Countries like Tajikistan, which did not show any improvement, indicate the need for targeted interventions and policies to enhance hygiene standards in healthcare facilities. Overall, the findings emphasize the critical role of hygiene in promoting public health and the need for sustained investments to improve healthcare systems globally.